Query Modification by Discovering Topics from Web Page Structures

Authors

  • Satoshi Oyama
  • Katsumi Tanaka
Abstract

We propose a method that identifies, from Web pages, pairs of keywords in which one word describes the other, and uses these relations to modify the query. The method takes into account the positions of words in the page structure when counting their occurrences and applies statistical tests to examine differences between word co-occurrence rates. Regardless of word type, it finds related keywords more robustly than conventional methods, which do not consider page structure. It can also identify subject and description keywords in the user’s input and find additional keywords for detailing the query. By considering document structure, our method can construct queries that are more focused on the user’s topic of interest.
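
The abstract does not spell out which page positions or which statistical test the method uses, so the following is only a minimal sketch under stated assumptions: pages are split into a title and a body, and a one-sided two-proportion z-test compares how often a candidate word appears in the body of pages whose title contains the subject word against pages whose title does not. The names `rate_difference_test` and `is_description_of`, the toy `pages` data, and the 0.05 threshold are all hypothetical.

```python
from math import erfc, sqrt


def rate_difference_test(hits_1, n_1, hits_2, n_2):
    """One-sided two-proportion z-test (assumed test, not taken from the paper).

    Returns (z, p_value) for the alternative "rate 1 is greater than rate 2".
    """
    if n_1 == 0 or n_2 == 0:
        return 0.0, 1.0  # no evidence when one group is empty
    p_1, p_2 = hits_1 / n_1, hits_2 / n_2
    pooled = (hits_1 + hits_2) / (n_1 + n_2)
    se = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    if se == 0:
        return 0.0, 1.0  # both rates are identical at 0 or 1
    z = (p_1 - p_2) / se
    return z, 0.5 * erfc(z / sqrt(2))  # upper-tail normal probability


def is_description_of(pages, subject, candidate, alpha=0.05):
    """Guess whether `candidate` describes `subject` from position-aware counts.

    Assumption: each page is a dict with 'title' and 'body' sets of words.
    """
    with_title = [p for p in pages if subject in p["title"]]
    without_title = [p for p in pages if subject not in p["title"]]
    hits_1 = sum(candidate in p["body"] for p in with_title)
    hits_2 = sum(candidate in p["body"] for p in without_title)
    _, p_value = rate_difference_test(
        hits_1, len(with_title), hits_2, len(without_title)
    )
    return p_value < alpha


# Toy corpus (fabricated for illustration only, not data from the paper).
pages = (
    [{"title": {"kyoto"}, "body": {"temple", "garden"}}] * 30
    + [{"title": {"kyoto"}, "body": {"garden"}}] * 10
    + [{"title": {"osaka"}, "body": {"temple"}}] * 6
    + [{"title": {"osaka"}, "body": {"castle"}}] * 54
)

print(is_description_of(pages, "kyoto", "temple"))
print(is_description_of(pages, "kyoto", "castle"))
```

In this sketch a candidate that passes the test for a subject word could be appended to the query as a description keyword; a real system would need counts from a search engine or crawl rather than an in-memory list.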

Similar resources

Ontology Driven Focused Crawling of Web Documents

With the dynamism of the World Wide Web in recent years, discovering relevant web pages has become an important challenge. A focused crawler aims to selectively seek out pages that are relevant to a pre-defined set of topics. Most current approaches perform syntactic matching, that is, they retrieve documents that contain particular keywords from the user’s query. This often leads t...

Discovering Topics to Enhance Communities of Creation from Links to the Future

The World Wide Web is a great source of new topics that are significant for the birth and creation of trends. Here we propose a method for discovering such topics from the web. The pages obtained attract the attention of people from multiple interest communities, reinforcing the spread of latent interest trends. Topics in such pages can trigger personal and social progress of interests, beyond the bounds of exist...

Query expansion based on relevance feedback and latent semantic analysis

Web search engines are among the most popular tools on the Internet and are widely used by expert and novice users alike. Constructing an adequate query that best specifies the user’s information need for the search engine is an important concern of web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...

Quality Evaluation of Search Results by Typicality and Speciality of Terms Extracted from Wikipedia

In Web search, it is often difficult for users to judge which page they should choose among search results and which page provides high quality and credible content. For example, some results may describe query topics from narrow or inclined viewpoints or they may contain only shallow information. While there are many factors influencing quality perception of search results, we propose two impo...

Discovering Seeds of New Interest Spread from Premature Pages Cited by Multiple Communities

The World Wide Web is a great source of new topics significant for trend birth and creation. In this paper, we propose a method for discovering topics that stimulate communities of people into earnest communication about the topics' meaning and grow into trends of popular interest. The pages obtained are web pages that attract the attention of people from multiple interest communities. It is show...

Publication date: 2004